ABSTRACT

R2 non-LTR retrotransposons insert at a specific site 
in the 28S rRNA genes of many animal phyla. R2
elements encode a single polypeptide with reverse 
transcriptase, endonuclease and nucleic acid bind- 
ing domains. Integration involves separate cleavage 
of the two DNA strands at the target site and utiliza- 
tion of the released 3' ends to prime DNA synthesis. 
Critical to this integration is the ability of the pro- 
tein to specifically bind 3' and 5' regions of the R2 
RNA. In this report, alanine mutations in two con- 
served motifs N-terminal to the reverse transcriptase 
domain were generated and shown to result in pro- 
teins that retained the ability to cleave the first strand 
of the DNA target, to reverse transcribe RNA from 
an annealed primer and to displace annealed RNA 
when using DNA as a template. However, the mutant 
proteins had greatly reduced ability to bind 3' and 
5' RNA in mobility shift assays, use the DNA target 
to prime reverse transcription and conduct second- 
strand DNA cleavage. These motifs thus appear to 
participate in all activities of the R2 protein known to 
require specific RNA binding. The similarity of these 
R2 RNA binding motifs to those of telomerase and 
group II introns is discussed. 

INTRODUCTION 

Non-LTR retrotransposable elements, also known as long
interspersed nuclear elements (LINEs), are highly abun- 
dant genomic parasites of eukaryotes. One of the best- 
characterized members of this class is R2, an element which 
exclusively inserts into a highly conserved sequence of the 
28S rRNA genes (1,2). The extreme sequence specificity of 
R2 for a site in the 28S rRNA genes has enabled detailed 
studies of its retrotransposition mechanism (3,4). These
studies have suggested that integration involves symmetri- 
cal reactions by R2 protein subunits bound upstream and 
downstream of the insertion site (Figure 1). Protein bound to
the 3' end of the R2 RNA transcript cleaves one strand of
the DNA target and uses the cleaved end to prime reverse 
transcription of the R2 transcript. Protein bound to the 5' 
end of the RNA transcript cleaves the second DNA strand 
and again uses the released DNA end to prime synthesis of 
the second DNA strand. This integration mechanism has 
been termed target primed reverse transcription (TPRT) (3). 

The reverse transcriptase encoded by R2 (R2-RT) has un- 
usual properties when compared to the RTs of retroviruses. 
First, priming of DNA synthesis from the 3' hydroxyl ex- 
posed by the DNA cleavage does not require complemen- 
tarities between the DNA target and the RNA template 
(5,6). Second, R2-RT is highly processive, presumably be- 
cause release of the RT from the RNA template would re- 
sult in a truncated insertion (7). Third, upon reaching the 
end of an RNA template the R2-RT can add up to 5 non- 
templated nucleotides and use microhomology to jump to 
another RNA or DNA template (8). Finally, R2-RT can use 
the DNA of an RNA:DNA heteroduplex as template, displacing
the RNA strand (9,10). One property that R2-RT
shares with retroviral RT is the ability to extend past nu- 
cleotide mismatches resulting in an error rate similar to that 
of the RT from HIV-1 (11).

In addition to the specificity of the R2 endonuclease for 
the 28S gene target site, R2-RT has high specificity when 
binding RNA. The only RNA sequences utilized in the 
TPRT reaction are those that contain the 3' untranslated 
region (3' UTR) of R2 (5). This 3' UTR RNA can be folded 
into a precise structure that has been shown to be conserved 
across related species (12,13). While enzymatic and DNA 
binding domains of the R2 protein have been previously 
characterized, nothing is known of the location or number 
of the RNA binding domains. Two model systems with po- 
tential clues to identifying the RNA binding domains in R2 
are mobile group II introns and telomerase. Mobile group 
II introns are self-splicing retroelements found in bacteria 
and organellar genomes (14). These introns encode an en- 
zymatically active RNA (ribozyme), which catalyses its own 
splicing from a co-transcript, and reverse splicing of that in- 
tron into a new DNA target site. These introns also encode 
a protein, which helps to fold the intron RNA (maturase),
Figure 1. TPRT model for the integration of R2 elements (3,4). DNA target
site, solid black line; R2 RNA transcript, solid gray line; newly synthesized
R2 DNA, dashed black line; gray oval, R2 protein. Association of the
R2 protein with the 3' end of the R2 transcript directs DNA binding of the 
protein upstream of the 28S rRNA gene insertion site. This protein cleaves 
the bottom strand of DNA and uses the released 3' end to prime reverse 
transcription of the R2 transcript (cDNA synthesis) directly onto the tar- 
get site. Association of the R2 protein with the 5' end of the R2 transcript 
directs DNA binding of the protein downstream of the insertion site. This 
protein cleaves the top strand of DNA and uses the released 3' end to prime 
synthesis of the second DNA strand displacing the RNA strand while us- 
ing the cDNA as template. Finally, DNA repair seals the nicks remaining 
at each end of the insertion. 



cleave the target site (endonuclease) and reverse transcribe 
the reverse-spliced RNA template using the new target site
as primer (TPRT) (15). While telomerase is not encoded by 
a mobile element, this protein (TERT) binds a cellular RNA 
(TER) and catalyzes a reaction similar to a TPRT reaction 
(16). Namely, the 3' end of a chromosome is used to repeat- 
edly prime reverse transcription of a short sequence of TER, 
thereby forming the tandemly repeated telomeric sequences
found in most eukaryotes. 

Therefore, the R2 protein, group II intron protein and 
telomerase share the property of specifically binding an 
RNA molecule that will be used as a template for reverse 
transcription primed by the 3' end of a DNA strand. Crit- 
ical RNA binding domains for both group II introns and 
telomerase have been shown to be immediately N-terminal 



to the fingers region of their reverse transcriptase domain 
(17-19). In this report, we show that the R2 protein also en- 
codes RNA binding motifs immediately N-terminal of its 
RT domain. Mutations of these motifs do not affect recog- 
nition and first strand cleavage of the DNA target site or the 
basic properties of the RT, but they do affect the ability of 
the enzyme to conduct the TPRT reaction, bind the 3' or 5' 
end of R2 RNA and cleave the second DNA strand. 



MATERIALS AND METHODS 

Mutagenesis and purification of wild type and mutant R2 pro- 
teins 

Alanine replacement mutations of the Bombyx mori R2 
expression construct pR260 (3) in the -1 region (R310A, 
R311A, Q318A) and in the 0 region (G408A, D410A) 
were generated by the QuikChange Lightning Multi Site-Directed
Mutagenesis kit (Stratagene) using mutagenic
primers with point mutations in the appropriate codons.
Mutant constructs were sequenced and transformed into
Escherichia coli JM109, and proteins were purified as described
previously (3,10) with the following modifications. Protein
extracts from the high speed centrifugation were diluted to 0.3
M NaCl and loaded onto a 20 ml Q-Sepharose column, 
washed extensively and eluted with 0.5 M NaCl. Protein 
fractions of 1 ml volume were collected and tested for reverse
transcriptase activity on poly (rA)/poly (dT)13-18 templates
with α-32P-labeled dTTP (10). Aliquots from each
fraction were also run on a 1% agarose gel stained with 
ethidium bromide to determine the fractions containing E. 
coli RNA. Fractions with high reverse transcriptase activ- 
ity without co-purifying RNA were pooled and dialyzed 
against R2 storage buffer for 3 h at 4°C. The protein was
stored in 0.1% Triton X-100 and 0.1 mg/ml bovine serum
albumin at -20°C. Protein concentrations were determined
by silver staining (BioRad) of sodium dodecyl sulphate
(SDS) polyacrylamide gels using bovine serum albumin as
the protein concentration standard. 



Preparation of DNA and RNA substrates 

A 110-bp fragment of the 28S rRNA gene was used as 
the target DNA substrate for all cleavage, TPRT and elec- 
trophoretic mobility shift assays (EMSA). This DNA sub- 
strate was generated by PCR amplification from clone 
pB109 (3) with a forward primer (5' AATTCAAGCAAG 
CGCGG 3') complementary to a region 50 bp upstream of 
R2 insertion site and reverse primer (5' CTAAGGATCC 
CGTTAAT CCATTCATG 3') complementary to a region 
60 bp downstream of the R2 insertion site. The target DNA 
substrates were 5' end-labeled on either the bottom strand 
or on both strands by end-labeling the reverse or both PCR 
primers using γ-32P-ATP (Perkin-Elmer/Life Sciences, 6000
mCi/mmol). PCR amplifications, gel purification and elution
of target DNA were performed as described previously
(20) with the final DNA pellet dissolved in water and stored
at -20°C. The 320 nt single-stranded DNA template was derived
from the M13mp18 vector (Invitrogen) as described
previously (10). R2 3'RNA and 5'RNA were generated in 
vitro using T7 RNA polymerase and purified as described 
previously (9,10). The 100 nt non-specific (NS) RNA contained
only M13mp18 sequences, also as described previously (10).

DNA cleavage and TPRT assays 

All assays were performed in a 13 µl volume containing 10
mM Tris-HCl (pH 8.0), 200 mM NaCl, 5 mM MgCl2, 1 mM
dithiothreitol, 0.1 mg/ml bovine serum albumin, 0.01% Tri- 
ton X-100, 10%-12% glycerol with 4 ng labeled target DNA 
and variable amounts of R2 protein as described in the fig- 
ure legends (4,20). The cleavage assays were performed by 
pre-incubating the R2 protein with 1 µg of RNase A or with
100 ng of 3'RNA, 5'RNA or non-specific RNA for 5 min at 
room temperature followed by incubating with target DNA 
for 20 min at 37°C. TPRT assays were performed by pre- 
incubating the R2 proteins at room temperature for 5 min 
with increasing amounts of 3' RNA (12.5-100 ng), followed 
by incubating with target DNA and 25 µM of each dNTP
for 20 min at 37°C. Reactions were stopped by the addi- 
tion of three volumes of 95% (v/v) ethanol containing 0.3 
M sodium acetate pH 5.3 and 0.1% (w/v) SDS. After pre- 
cipitation the products were resuspended and incubated at 
95°C for 5 min in 10 µl of 40 mM ethylenediaminetetraacetic
acid (EDTA), 99% formamide loading dye. The products 
were separated on 8% polyacrylamide-urea gels and quantified
using a PhosphoImager (BioRad).

Primer extension assays 

The ability of wild type and mutant R2 proteins to extend
DNA primers annealed to either RNA or DNA templates
was assayed as described previously (11). Polymerization
was initiated with 250 µM of each dNTP with the
amounts of substrate and protein used as described in the 
figure legends. Reactions were terminated and separated on 
14% polyacrylamide-urea gels (11).

Strand displacement assays 

The primer, 5' CTGGCGAAAGGGGGATGTGC 3', was 
5' end labeled and annealed to a 320 nucleotide single 
stranded DNA template or this DNA template annealed to 
a 100 nt RNA block as described previously (10). Substrates 
were pre-incubated with the R2 proteins or T4 DNA polymerase
(Fermentas) for 10 min at room temperature. Reactions
were initiated by addition of 250 µM of each dNTP,
incubated at 37°C for 20 min and the reactions stopped by the
addition of three volumes of 95% (v/v) ethanol containing 
0.3 M sodium acetate pH 5.3 and 0.1% (w/v) SDS. Products 
were separated on 8% polyacrylamide-urea gels. 

Electrophoretic mobility shift assays 

Reaction conditions were as described for the DNA cleavage
and TPRT assays. Reactions were first pre-incubated
with 1 µg of RNase A or 100 ng of 3' or 5' RNA for 5
min at room temperature followed by incubating with tar- 
get DNA for 20 min at room temperature. Reactions were 
chilled on ice and run on 5% native polyacrylamide gel as 
described previously (4,20). EMSA assays were performed 



in either the presence or absence of 0.05 mM EDTA to 
inhibit/allow DNA cleavage. The non-specific competitor 
poly (dIdC) was added to 25 µg/ml in all reactions.

RESULTS 

Conserved motifs upstream of the RT domain 

R2 elements are prevalent in most arthropod lineages (1) as
well as other metazoans as diverse as vertebrates and hydra 
(2). All R2 elements contain a single open reading frame 
that encodes a 1100-1200 amino acid protein with similar 
conserved domains (Figure 2). The central one-third of the 
R2 protein contains the motifs that have been identified in 
the fingers, palm and thumb subdomains of various reverse 
transcriptases (21). C-terminal to the RT domain is a region 
encoding a zinc-finger motif and the active site of a restric- 
tion enzyme-like endonuclease (22). N-terminal to the RT 
domain is a DNA binding domain containing a c-myb and 
one to three zinc finger motifs (23). The region between the 
DNA binding domain and the RT domain is somewhat vari- 
able in length with only low levels of amino acid sequence 
conservation (24). 

Because in both mobile group II introns and telomerase 
an RNA binding domain has been identified N-terminal to
the fingers and palm subdomains of the RT (18,19), this 
region of the R2 open reading frame (ORF) was compared
in all available R2 sequences. The alignment in Figure 2
contains representative R2 sequences from 10 arthropods,
two vertebrates and a hydra as well as multiple R2 lineages
from a tunicate. Sequence conservation was detected
in two regions, each ~35 aa in length, separated by a 39-51 
aa region with lower sequence identity. The region nearest 
the RT domain contained the highly conserved PGPDG se- 
quence, which had previously been suggested to be part of 
the RT domain of all non-LTR retrotransposons (24,25). 
In these earlier reports, this motif was referred to as the 0 
motif as it was N-terminal to the core motifs (labeled 1-7)
that are conserved in all reverse transcriptases. The second
region of the R2 proteins had fewer sequence constraints
but contained a conserved interspersion of charged and
hydrophobic residues and will be referred to as the -1 region.

Three conserved residues within region -1 and two 
residues in region 0 were individually mutated to alanine 
residues, and the mutant proteins tested for their ability to 
conduct the TPRT reaction with the 3' UTR RNA of R2 
(described below). While a slight reduction in product levels 
was sometimes detected (data not shown) all mutant pro- 
teins retained significant ability to conduct the TPRT re- 
action. More dramatic reductions were obtained by com- 
bining the three alanine mutations in the -1 region into one 
construct and the two mutations in the 0 region into a sec- 
ond construct. The activities of these two mutant proteins 
are described below and the proteins are referred to as the 
-1 and 0 region mutations. 

First strand DNA cleavage 

To ensure the mutations did not influence the overall structure
or stability of the R2 protein, the -1 and 0 region mutant
proteins were first tested for their ability to specifically
cleave the first strand of the DNA target. Previous studies 
Figure 2. Location and sequence of the -1 and 0 regions in R2 elements. Shown at the top is an amino acid sequence alignment of the open
reading frames immediately N-terminal to the common conserved motifs found in all reverse transcriptases. Positions with the same or chemically similar 
amino acids are shaded. The 17 R2 sequences come from diverse arthropods (10 sequences), the tunicate Ciona intestinalis (4 sequences), two vertebrates 
(Petromyzon marinus and Danio rerio) and a hydra (Hydra magnipapillata). The alanine replacements generated for the -1 and 0 mutant proteins are shown. 
At bottom are diagrams of the R2 proteins from three diverse metazoans indicating the similar structure of R2 proteins from all animals. The encoded 
protein contains N-terminal zinc finger (Zn) and c-myb domains (23), the various subdomains of all reverse transcriptases (21,25,32) and C-terminal zinc 
finger and endonuclease domains (22). An alternate labeling of reverse transcriptase domains frequently used for telomerases is shown above the more
generic terms (16,18). 



have shown that the R2 protein in the absence of RNA can 
cleave only one strand of the DNA target site (3). This first, 
or bottom, strand cleavage is used to prime the reverse tran- 
scription step. As shown in Figure 3A, the cleavage assay 
contained a 110-bp fragment of the 28S rRNA gene with the
5' end of the bottom strand labeled (the DNA target site is 
drawn with the 5' end of the 28S gene on the left). Because 
of the high affinity of the R2 protein for the DNA target 
site after cleavage, the R2 protein is capable of only a single 
cleavage cycle (26). The target DNA was incubated with in- 
creasing amounts of R2 protein (Figure 3B) for 30 min at 
37°C, and the cleavage products were separated on an 8% 
denaturing polyacrylamide gel. The wild-type and mutant 
proteins generated similar levels of the 60 nucleotide cleav- 
age product, with ~50% of target DNA cleaved with 2 ng 
of protein (lane 3). These data suggested that the 0 and -1 
region mutations had minimal effect on the expression and 
purification of the R2 protein, its stability after purification 
and the ability of the protein to recognize and cleave the 
target DNA. 



Primer extension on RNA and DNA templates 

To determine if the mutations affected the basic nucleotide 
polymerase activity of the protein, the 0 and -1 mutants 
were tested for their ability to extend a DNA primer an- 
nealed to a template. As diagramed in Figure 4A, the ex- 
tension assays employed a 5' end-labeled DNA primer an- 
nealed to either an RNA or DNA template. These primer 
extension assays also monitored two other properties of the 
wild-type R2-RT. First, upon reaching the end of the tem- 
plate, R2-RT adds 3 to 5 additional (non-templated) nu- 
cleotides (8). Second, the polymerase can use microhomol- 
ogy between the non-templated nucleotides and the end of 
another template to continue DNA synthesis. This activity, 
referred to as template jumping, can occur with both the ex- 
cess primer and the single-stranded RNA or DNA template 
present in the assay (8). 

As shown in Figure 4B, the -1 and 0 mutant proteins 
were able to extend the primer and generate full-length (FL) 
products on both the RNA and DNA templates. By using 
the amount of bottom strand DNA cleavage to equilibrate
the level of active R2 protein, the 0 region mutant protein 
generated on average ~20% fewer FL extension products 
compared to wild type and the -1 mutant in three different 
Figure 3. First strand DNA cleavage assay. (A) Diagram of the assay. The R2 protein is incubated with a 110 bp DNA fragment containing the 28S gene
target site. The bottom strand (also called first strand) is 5' end-labeled with 32P. Cleavage generates a 60 nt labeled and a 50 nt unlabelled fragment. (B)
Phosphoimager scan of cleavage reactions by the wild type, -1 and 0 mutant proteins separated under denaturing conditions on an 8% polyacrylamide-urea 
gel. For each reaction, 4 ng of target DNA was incubated with increasing amounts of protein for 20 min at 37°C. Lane 0, no protein; lane 1, 0.5 ng; lane 
2, 1 ng; lane 3, 2 ng; lane 4, 4 ng. 



protein preparations. Upon reaching the end of the tem- 
plate, both mutant proteins were similar to wild-type pro- 
tein in their ability to add up to 5 non-templated nucleotides 
(bands labeled NT directly above the FL band). However, 
while the -1 mutant protein retained the ability to jump from 
the initial template to another template and continue poly- 
merization, template jumps with the 0 mutant were reduced 
to less than 10% of the wild-type protein. These experiments 
suggest the mutations in the -1 region of the R2 protein had 
minimal effect on the functioning of the RT domain. On
the other hand, mutations in the 0 region somewhat reduced 
the efficiency of simple extension reactions and more exten- 
sively reduced the ability of the RT to jump between tem- 
plates. 



Strand displacement during synthesis 

Another unusual property of the R2 polymerase is its abil- 
ity while synthesizing DNA to displace an RNA or DNA 
strand annealed to an RNA or DNA template (10). This 
activity is important in the R2 integration reaction because 
the R2 protein does not contain an RNase H domain (25), 
and no RNase H activity could be detected in in vitro assays
(27). Therefore, in the TPRT reaction after reverse transcription
by R2-RT (first strand synthesis), the RNA template
remains annealed to the cDNA and must be displaced
during second strand synthesis. To examine the ability of the
-1 and 0 region mutants to conduct strand displacement, 
a 100-nt RNA strand (RNA block) was annealed to the 5' 
end of a 320 nt DNA template. Polymerization was initiated 



with a 5' end labeled primer annealed to the 3' end of the 
DNA template (see Figure 5A). T4 polymerase, which lacks 
strand displacement activity (10), was used as a control to 
confirm that the DNA template was completely blocked by 
RNA annealed to the template DNA strand. 

As shown in Figure 5B, in the absence of the RNA block, 
T4 polymerase continued polymerization to the end of the 
DNA template generating a FL 320 nt product (band la- 
beled FL). In the presence of the RNA block, polymeriza- 
tion by T4 DNA polymerase was completely stopped at 220 
nt when the polymerase reached the DNA:RNA heterodu- 
plex (band labeled B). In contrast, wild-type R2 synthesized 
mostly FL product in the presence or absence of the RNA 
block. Indeed, processivity of the R2 polymerase is actually 
higher while displacing the RNA block, as there are fewer 
premature terminations in the duplexed region of the tem- 
plate (minor bands below the FL product). Similar to the 
wild-type R2 protein, both the -1 and 0 region mutants syn- 
thesized similar levels of FL product with and without the 
RNA block, suggesting there was no reduction in their abil- 
ity to displace RNA during the polymerization reaction. 
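
The band sizes expected in this assay follow from simple arithmetic on the substrate lengths given above. The following minimal Python sketch uses hypothetical function and variable names and assumes the RNA block covers the final 100 nt at the 5' end of the 320 nt template. Under those assumptions it reproduces the 320 nt full-length (FL) and 220 nt blocked (B) products.

```python
# Illustrative only: names are hypothetical; lengths are taken from the text.
TEMPLATE_NT = 320   # single-stranded DNA template
RNA_BLOCK_NT = 100  # RNA annealed to the 5' end of the template


def expected_product_nt(block_present: bool, can_displace: bool) -> int:
    """Return the expected length of the labeled extension product.

    The primer anneals at the 3' end of the template, so an unobstructed
    polymerase copies all 320 nt. A polymerase that cannot displace the
    RNA block stops where the DNA:RNA heteroduplex begins.
    """
    if block_present and not can_displace:
        return TEMPLATE_NT - RNA_BLOCK_NT
    return TEMPLATE_NT


# T4 DNA polymerase (no strand displacement) versus R2-RT (displaces RNA).
t4_blocked = expected_product_nt(block_present=True, can_displace=False)
r2_blocked = expected_product_nt(block_present=True, can_displace=True)
print(t4_blocked, r2_blocked)
```

In this simplified picture a mutant that had lost strand displacement would behave like T4 DNA polymerase and accumulate the 220 nt band, which is not what was observed for either mutant.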



Target primed reverse transcription 

The critical step in the R2 integration reaction is the utiliza- 
tion of the nick on the bottom strand of the DNA target site 
to prime reverse transcription (TPRT) (Figure 1). Unlike a 
simple primer extension reaction, binding of the RNA tem- 
plate in this reaction is highly specific. The only templates 
that can be used for TPRT are RNAs that contain the 250 
Figure 4. Primer extension assays on DNA and RNA substrates. (A) Diagram of the multiple products generated by the R2 polymerase in a primer
extension assay. The reaction is started with a DNA primer annealed to the 3' end of a DNA or RNA template. After extending to the end of the substrate, 
the R2 polymerase can add up to 5 non-templated nucleotides and then use microhomologies between this extension and the end of another substrate to 
jump to another template. (B) Phosphoimager scans of the products generated by the wild type, -1 and 0 mutant proteins. A 5' end labeled 17-mer primer 
were analyzed on denaturing 14% polyacrylamide-urea gels. All three proteins added additional non-templated nucleotides (NT) beyond the band labeled
full-length (FL) and the wild-type and -1 mutant proteins conducted template jumps to another template or the excess primer in the assay (bands labeled 
Jumps). The ladder of bands seen below the FL band corresponds to low levels of R2 protein dissociating from the template after each polymerization step 
(11). 



nt 3' UTR of the R2 element. Priming of reverse transcrip- 
tion by the DNA nick does not require complementarity 
between the RNA template and the DNA target (5,6). To 
test the ability of the -1 and 0 region mutant proteins to 
conduct the TPRT reaction, the same 5' end labeled DNA 
target site used to test DNA cleavage (Figure 3) was incu- 
bated with the R2 protein in the presence of a 270 nt RNA 
containing the 3' UTR region of the R2 element and dNTPs 
to allow reverse transcription. As diagramed in Figure 6A,
initial cleavage of the bottom strand of the DNA substrate 
results in a labeled 60 nt fragment, and the utilization of this 
nick to prime reverse transcription lengthens this DNA to 
330 nt if polymerization continues to the end of the input 
RNA. Because the assay was designed to measure the abil- 
ity of the protein to bind RNA, the amount of R2 protein 
in the assay was held constant at a level that resulted in 50% 
cleavage of the target site and the amount of R2 RNA was 
increased. 



As shown in Figure 6B, wild-type R2 generated signif- 
icant levels of TPRT product at all RNA concentrations. 
On the other hand, the -1 and 0 region mutants generated 
only low levels of TPRT products at the highest RNA levels. 
Quantifying the amount of label present in the TPRT band 
relative to the level of cleavage product revealed that the
-1 mutant had less than 5% of the wild-type TPRT activity,
and the 0 mutant had less than 10% of the wild-type TPRT
activity (Figure 6C). Because the mutant proteins were readily
able to cleave the bottom DNA strand of the target site
and were capable of reverse transcription once a template 
was primed (Figure 5), this assay suggested that the mutant 
proteins were either unable to bind the RNA, or unable to 
properly position the 3' UTR RNA adjacent to the cleaved 
DNA to allow priming. 
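
The quantification described above can be sketched as follows. This is a minimal Python illustration, not the authors' analysis procedure. The band intensities are invented placeholders and the function names are hypothetical. The sketch also assumes that the "level of cleavage product" is the sum of the unextended 60 nt band and the TPRT band.

```python
# Illustrative only: intensities are made-up placeholders, not data from the paper.
CLEAVED_NT = 60        # labeled bottom-strand fragment after first-strand cleavage
RNA_TEMPLATE_NT = 270  # input RNA containing the 3' UTR

# Full-length TPRT product: cleaved fragment extended to the end of the RNA.
FULL_LENGTH_TPRT_NT = CLEAVED_NT + RNA_TEMPLATE_NT


def tprt_fraction(tprt_band: float, unextended_band: float) -> float:
    """Fraction of cleaved target DNA that was extended by TPRT.

    Assumption: total cleavage = TPRT band + unextended 60 nt band.
    """
    total_cleaved = tprt_band + unextended_band
    if total_cleaved <= 0:
        raise ValueError("no cleavage product detected")
    return tprt_band / total_cleaved


def percent_of_wild_type(mutant: float, wild_type: float) -> float:
    """Express a mutant TPRT fraction as a percentage of the wild-type value."""
    return 100.0 * mutant / wild_type


# Hypothetical phosphorimager counts for one RNA concentration.
wt = tprt_fraction(tprt_band=400.0, unextended_band=600.0)
mutant = tprt_fraction(tprt_band=15.0, unextended_band=985.0)
relative = percent_of_wild_type(mutant, wt)
print(FULL_LENGTH_TPRT_NT, round(relative, 2))
```

Normalizing to the cleavage product rather than to total input DNA keeps the comparison independent of small differences in the amount of active protein between preparations.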

Binding of RNA to the R2 protein 

EMSA were next performed to directly determine whether 
failure of the TPRT assay was a failure of the mutant R2 
proteins to bind the 3' UTR RNA or a failure to position
the RNA properly for TPRT. These gel shift experiments 
were conducted only with R2 RNAs, as it is difficult to detect
wild-type R2 protein binding to non-R2 RNA (4,6,9,20). 
We have also previously shown that gel shift assays to mon- 
itor RNA binding are most sensitive when conducted in the 
presence of the DNA target (9,20). The affinity of the R2 
protein for both the DNA target and R2 RNA substrates 
greatly increases in the presence of the other nucleic acid. 
As shown in Figure 7A, in the absence of RNA, incubation 
of the wild-type protein with the 110 bp DNA target site 
generated increased levels of shifted complexes as the pro- 
tein concentration was increased (lanes 1-3). This shifted 
band corresponds to protein bound to the DNA immedi- 
ately upstream of the insertion site (diagram at left) (20,28). 
If, however, the R2 protein was pre-incubated with an excess 
of the 270 nt 3' UTR RNA to allow protein-RNA binding, 
and then the 110 bp DNA target site was added, maximum
levels of a super-shifted complex (protein:RNA:DNA) were 
formed even at the lowest protein concentration (lane 4). 
This finding confirmed that the presence of 3' UTR RNA 
significantly increases the affinity of the R2 protein for the 
DNA target (20). 

Shift assays done with the -1 and 0 mutations revealed 
target DNA binding in the absence of RNA at levels simi- 



lar to that observed for the wild-type protein. In the case of 
the 0 mutant, higher levels of more slowly migrating bands 
were observed indicating a greater tendency for additional 
protein subunits to bind to the DNA target at the high- 
est protein concentration (lane 15). In the presence of the 
R2 RNA, the shifted complexes observed for the -1 mutant 
were identical to those observed for the protein only assays, 
suggesting a greatly reduced ability of this protein to bind 
RNA. In the case of the 0 mutant protein, no RNA-induced 
super-shifted band was observed at low protein concentra- 
tions also consistent with a reduced ability of this protein to 
bind RNA. However, a range of slower migrating bands was 
observed at higher concentrations of protein in the presence 
of RNA compared to the absence of RNA (compare lanes
15 and 18), suggesting a residual level of RNA binding was
still possible in the 0 region mutation. 

In addition to the ability of the R2 protein to bind RNA 
corresponding to the 3' UTR of the R2 element, the R2 pro- 
tein can also bind a 300 bp region of RNA from near the 
5' end of the R2 element (4). This 5' RNA binding plays 
an important role in the integration reaction, as it greatly 
increases the affinity of the R2 protein for target DNA se- 
quences immediately downstream of the insertion site (4) 
(Figure 1). Gel shift assays for the binding of the R2 pro- 
tein in association with either the 3' or 5' RNA are shown 
Figure 6. TPRT assay. (A) Diagram of the TPRT reaction. After bottom
(first) strand cleavage of the DNA target 5' labeled on the bottom strand,
for 20 min at 37°C. (C) Quantitation of the TPRT product generated in

in Figure 7B. Unlike the assays in Figure 7A, these assays
were conducted in the presence of EDTA to prevent complications
associated with DNA cleavage. Double-stranded
DNA cleavage of the target DNA in the presence of RNA 
can be readily seen with the wild-type protein on the native 
gel shown in Figure 7A (lanes 4-6). 

As shown in lanes 3-6 of Figure 7B the wild-type protein 
formed specific super-shifted complexes on the target DNA 
with both the 3' and 5' RNAs. The 5' RNA is not bound 
as readily as the 3' RNA at low concentrations and the spe- 
cific complex formed at high concentrations of 5' RNA mi- 
grates somewhat slower than the complex with the 3' RNA. 
In contrast to the wild-type protein, the -1 mutant showed 
little evidence of super-shifted bands in the presence of ei- 
ther 3' or 5' RNA, suggesting the inability of this protein 
to bind both RNAs. In the case of the 0 mutant, slower mi- 
grating complexes were again detected in the presence and 
absence of RNA suggesting that multiple protein subunits 
are binding to the DNA target. These slower moving complexes
were most abundant in the presence of 3' RNA (compare
lanes 14, 16 and 18), again consistent with low levels
of RNA binding to the 0 mutant. 

Second strand DNA cleavage 

Cleavage of the two strands of DNA at the R2 insertion site 
is conducted by different R2 subunits (Figure 1) (4). Protein 
bound immediately upstream of the insertion site cleaves the 
bottom DNA strand, while protein bound downstream of 
the insertion site cleaves the top DNA strand. Cleavage of 
the top strand is much less efficient and only occurs if RNA 
is present in the reaction (3). The presence of the RNA does
not appear to play a direct catalytic role in the DNA cleav- 
age reaction as any RNA sequence will act as a stimulus (3). 
Any RNA sequence can promote binding of the R2 protein 
downstream of the insertion site which enables top strand 
cleavage as long as R2 protein is also bound upstream of 
the insertion site. The 5' R2 RNA is unique, however, in 
that it can promote downstream binding of the protein in 
the absence of upstream binding (4). To assay whether the
-1 and 0 mutant proteins were able to bind RNA well enough
to enable top strand cleavage, the 110 bp DNA target was 
5' end labeled on both the top and bottom strands (Figure 
8A). Cleavage of the bottom strand results in a labeled 60 
nt fragment while cleavage of the top strand results in a la- 
beled 48 nt fragment. Cleavage was assayed in the presence 
of 3' RNA, 5' RNA or non-specific RNA (plasmid vector 
sequences). As shown in Figure 8B, wild-type protein was 
able to cleave the top strand in the presence of all three types 
of RNA. With a level of detection of ~5% that of wild-type 
protein cleavage, both the -1 and 0 mutants were unable to 
cleave the top strand in the presence of any of the RNAs. 

DISCUSSION 

Comparison of R2 protein sequences from diverse meta- 
zoans revealed two conserved regions (0 and -1) upstream 
of the series of common motifs that can be identified in all 
reverse transcriptases (21). Analysis of clustered mutations
that replace some of the most conserved amino acids in
these two motifs suggests that these motifs form part of an
RNA binding domain of the R2 protein. Proteins with
mutations in either motif retained the ability to extend
primers annealed to RNA or DNA templates but showed greatly
reduced TPRT activity and greatly reduced ability to bind
RNA in gel shift assays.

We previously suggested that the 0 motif might enable the 
R2 protein to bind more extensively to template sequences 
upstream of the active site of the RT domain, thus, allow- 
ing the R2-RT to undergo template jumps and displace an 
RNA strand annealed to a DNA template (8-10). Evidence 
for this model can be found in this report, because the 0 
mutant had dramatically reduced ability to undergo tem- 
plate jumps with either RNA or DNA templates (Figure 3). 
However, inconsistent with the previous model, the 0 mutant
was fully capable of displacing an RNA strand annealed
to a DNA template (Figure 5). Because a small reduction
in the activity of the 0 motif mutant was detected in simple
primer extension assays, the alterations in this motif may
have induced subtle changes in the catalytic domain of
R2-RT. On the other hand, the -1 motif mutation effectively
eliminated all activity of the R2 protein that was known to
be dependent upon RNA binding without any detectable
effect on the catalytic activity of the RT domain. Thus, we
suggest this motif has a high probability of being exclusively
involved in RNA binding. Much work remains to determine
how one region of the R2 protein is able to bind both the 3' 
and 5' regions of the R2 transcript, yet 3' RNA is positioned 
to be used as template for TPRT, while the 5' RNA changes 
the specificity of the R2 protein for binding to DNA. 

The 0 motif can be found in all lineages of non-LTR
retrotransposons (25,29). For example, in mammalian L1
elements, this motif has been termed the Z segment (29,30).
L1 encoded proteins in all mammals contain minor variants of
the sequence EL-9aa-SPGPDGF, highly similar to the
sequences conserved in all R2 elements (Figure 2). While the
studies reported here suggested that mutations in this motif
result in only a slight reduction in the ability of the R2-RT to
extend a primer, mutagenesis of the human L1 protein
suggested this motif was required for reverse transcription on
poly(A) and poly(G) templates (29). However, there are two
concerns regarding this previous study of the L1 protein.
First, the L1 protein has proven to be extremely difficult to
isolate, thus the in vitro assays were conducted with crude
cell lysates. Because no other activity could be assayed, there
was no means to ensure that the mutant L1 proteins were
stable. Second, the L1-RT activity being assayed was not
dependent upon the addition of a primer, thus, there is
uncertainty as to how DNA synthesis was initiated in these
assays. Based on the experiments presented here and the
similarity to telomerase and the RT of mobile group II introns
described below, we suggest that RNA binding is an
important role of the 0 motif in most other non-LTR
retrotransposons including L1.

RNA binding motifs N-terminal to the universal RT motifs
have also been identified in both mobile group II introns
and in telomerases. Group II introns have a conserved
motif (labeled RT0 in ref 31) with sequence similarity to
that of the 0 motif described here. Central to this similarity,
most group II introns contain a G-hydrophobic-D-G motif
similar to the conserved residues in the R2 protein that
were mutated in this report (Figure 2). However, in a
high-throughput screen for mutations of the Ll.LtrB
intron-encoded protein that reduced binding to intron RNA,
the protein regions most resistant to change were located on
either side of the conserved 0 motif (19). The region
N-terminal to motif 0 of Ll.LtrB was suggested to be
the major component of the RNA binding domain. This
N-terminal region, which could be functionally similar to
motif -1 of the R2 protein, is only 35 amino acids in length in
Ll.LtrB. In the case of R2, the N-terminal domain appears
larger, as the amino acid mutations that eliminated motif -1
functions in R2 are over 70 residues upstream of motif 0
(Figure 2).

The more extensive characterization of RNA binding by
telomerase has included X-ray diffraction studies of the
isolated RNA binding domain as well as the complete
telomerase (18,32). Two regions of the protein appear to be
involved in binding to the telomerase RNA: the T motif,
which, like the 0 motif of R2 and group II introns, is located
immediately upstream of the common RT motifs, and a second
domain, CP, which is located about 135 amino acids upstream
of T. The T and CP motifs interact with regions of the
telomerase thumb, effectively encircling the RNA template (18).
The T motif contains several conserved, large hydrophobic
residues and forms a narrow deep cleft in the protein. The
T motif has no obvious sequence similarity to the 0 motif
seen in R2 or Ll.LtrB. The CP motif forms a shallow cleft
that is less conserved in sequence and, similar to the -1
motif of R2, has an accumulation of interspersed hydrophobic
and charged residues.

Phylogenetic comparisons based on the sequences of the
conserved domains in all reverse transcriptase-like enzymes
suggest that non-LTR retrotransposons, telomerase and
group II introns are phylogenetically related and separate
from the reverse transcriptases of LTR retrotransposons
and retroviruses (16,21,25,33). The ability of each of these
enzymes to specifically bind an RNA and then to use that
RNA as a template for reverse transcription primed by the
3' end of DNA provides additional evidence for the close
evolutionary relationship of these enzymes. Future studies
of non-LTR retrotransposon RTs should use telomerase
and group II introns as guides for structure/function
studies.